Final Report

Proposal  Title:  A  New  Model  for  the  Estimation  of  Breast  Cancer  Risk 
P.I.:  Maryellen  L.  Giger,  Ph.D. 


INTRODUCTION: 

Cancer  risk  is  the  probability  that  cancer  will  occur  in  a  given  population.  Research 
on  cancer  risk  seeks  to  identify  populations  with  a  high  probability  of  developing  cancer. 

The goal of this research is to merge a computerized analysis of mammograms, which
characterizes the breast pattern, with information on a woman's personal and family histories
into a novel model for use in estimating risk of breast cancer. The specific aims include (1)
creating a database of mammograms, along with tabulated clinical information, of women at
low risk and high risk for breast cancer; (2) developing a new model using computer
methods for merging mammographic information with clinical information; and (3)
evaluating the efficacy of the new model compared to currently used methods of risk
assessment. The main hypothesis to be tested is that, given a group of women, the new
computerized risk model that merges computerized analyses of mammograms with clinical
information should yield a novel way of identifying those women at risk for breast cancer.
Potential  uses  of  this  innovative  model  include  1)  serving  as  a  means  to  assess  the  cancer  risk 
of  women  undergoing  routine  screening  mammography  and  thus,  identifying  those  women 
that  may  require  closer  scrutiny  and  2)  serving  as  a  means  to  monitor  the  cancer  risk  of 
women  undergoing  chemoprevention  treatments.  The  research  is  novel  in  that  currently 
there  does  not  exist  a  reliable  means  to  assess  the  cancer  risk  of  individual  women  using  both 
mammographic  and  clinical  information.  In  addition,  if  a  woman  knew  that  she  was  at  an 
increased  risk  of  breast  cancer,  it  is  likely  that  she  would  better  comply  with  screening 
mammography  programs.  In  the  future,  a  successful  model  could  also  be  used  to  assess  the 
effect of chemoprevention on a woman's parenchymal pattern and thereby, overall risk.


BODY: 

Task  1.  Establishment  of  database  (mos.  1-30) 

The high-risk database was collected within the University of Chicago Cancer Risk
Clinic and consists of mammograms, pedigree information, epidemiological data and
related biological specimens from patients with a family history of breast cancer. All
mammograms done since 1990 were considered for collection for all participants
irrespective of their cancer status. For each participant, breast cancer risk assessment was
performed using both Gail and Claus models and genetic testing whenever possible. A low-risk
database was also collected from our breast cancer screening program and includes
mammograms and clinical information on women undergoing routine screening
mammograms. The low risk database was developed to include women who are age-matched
to the women in our high risk database. We collected 380 cases
(yielding over 1000 films), which includes 143 "low risk" cases, 222 high/moderate risk
cases, and 35 BRCA1/BRCA2-mutation carriers. The low risk and high/moderate cases
were deemed to be low or moderate/high risk by the use of the clinical Gail and Claus
models by the University of Chicago Cancer Risk Clinic. In addition, the clinical
information of age for each patient was tabulated. The mammograms were converted to
digital format by using a laser film scanner (2048 by 2048 matrix with 12-bit
quantization). Such high spatial resolution is necessary in order to adequately retain the
high-frequency texture patterns.

Task 2. Development of risk model including mammographic markers and clinical
information (mos. 3-30)

Computerized  analysis  of  the  parenchymal  pattern  is  based  on  various  texture 
analysis  methods  we  have  developed  in  our  laboratory  including  Fourier  spectra  analysis, 
histogram  analysis,  and  artificial  neural  networks.  Fourteen  image  features  were 
extracted  within  the  regions  of  each  digitized  mammogram.  These  features  can  be 
grouped  into  (i)  features  based  on  the  absolute  values  of  the  gray  levels,  (ii)  features 
based  on  gray-level  histogram  analysis,  (iii)  features  based  on  the  Fourier  transform,  and 
(iv)  features  based  on  the  spatial  relationship  among  gray  levels. 

We employed three different approaches to relate these mammographic features
to breast cancer risk. In one approach, the features were used to distinguish
mammographic patterns seen in low-risk women from those seen in women who inherited a
mutated form of the BRCA1/BRCA2 gene. In another approach, the features were related to
risk as determined from existing clinical models (Gail and Claus models). Stepwise linear
discriminant analysis was employed to identify features that were useful in differentiating
between "low-risk" women and BRCA1/BRCA2-mutation carriers. Stepwise linear
regression analysis was employed to identify useful features in predicting the risk as
estimated from the Gail and Claus models. In the third approach, the features were used
to characterize mammographic patterns seen in low-risk women and in women who have
breast cancer. Stepwise linear logistic regression was employed to identify useful
features to differentiate between the mammographic patterns of low-risk women and
women with breast cancer. The relationship between the image patterns and the risk of
developing breast cancer was identified based on the odds ratios associated with these
image features. The computer-extracted mammographic features identified from these
three approaches were similar. The results from these studies show that women who
have dense breasts and whose mammographic patterns are coarse and low in contrast
have an increased risk of developing breast cancer. The consensus of the findings from
the three different approaches substantiated the existing results. (Presented CARS 2000)
Further investigation of the gene carrier group resulted in an RSNA 2000 presentation
(November, 2000) and an "in-press" Radiology paper.
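
The stepwise regression step described above can be illustrated with a minimal forward-selection sketch. It is a stand-in under stated assumptions, not the procedure used in the study: the report does not give the entry or removal criteria, so a hypothetical `min_gain` threshold on the increase in r² replaces the usual F-test, the helper names are illustrative, and the data are synthetic.

```python
import numpy as np

def r_squared(X, y):
    """r^2 of an ordinary least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def forward_select(X, y, min_gain=0.01):
    """Greedily add the feature with the largest r^2 gain (hypothetical rule)."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        gains = {j: r_squared(X[:, selected + [j]], y) - best_r2 for j in remaining}
        j_best = max(gains, key=gains.get)
        if gains[j_best] < min_gain:
            break  # no remaining feature adds enough explained variation
        selected.append(j_best)
        remaining.remove(j_best)
        best_r2 += gains[j_best]
    return selected, best_r2

# Synthetic data (assumption): only features 0 and 2 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + 0.1 * rng.normal(size=200)
selected, r2 = forward_select(X, y)
print(selected, r2)
```

A full stepwise procedure would also test whether previously entered features should be removed; this sketch only adds features.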

We also analyzed the contributions of age and computer-extracted
mammographic features in the prediction of breast cancer risk. We assessed the
contribution of the computer-extracted features to risk prediction in terms of the percent
increase in the prediction power (r²) between using age alone (the single most important
risk factor for breast cancer) and using age together with the mammographic features. The
inclusion of the mammographic features increased the prediction power (r²) from 0.08
and 0.16 (age alone) to 0.17 and 0.32, yielding increases of 113% and 100% in r² for
predicting the risk as estimated from the Gail and Claus models, respectively. The substantial
increase in r² indicates the important contribution of these mammographic features to risk
prediction and the need to incorporate them in predicting breast cancer risk. (Presented IWDM
2000)
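
The percent increases quoted above follow directly from the reported r² values. The short check below uses only those four reported numbers; the function name `percent_increase` is illustrative.

```python
def percent_increase(r2_age, r2_full):
    """Relative gain in r^2 when features are added to an age-only model."""
    return 100.0 * (r2_full - r2_age) / r2_age

gail = percent_increase(0.08, 0.17)   # about 112.5, reported as 113%
claus = percent_increase(0.16, 0.32)  # about 100
print(f"Gail: {gail:.1f}%  Claus: {claus:.1f}%")
```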

Task 3. Evaluation methods (mos. 20-36)

Correlation analysis was used in evaluating the performance of the computer-extracted
features and the clinical features. Linear correlation analysis was performed to
determine the correlation between the output of the new model and the Gail risk model (or
Claus model). We used the combined model based on the first two models (gene
mutation vs. low-risk and with cancer vs. without cancer) and evaluated the performance
of the combined measures using the Gail model.

We have entered into a collaborative agreement with the University of Toronto to
analyze data from the Ontario Breast Screening Program including 400 case-control pairs.
In a nested case-control database, the cases will correspond to women who will have
developed cancer and the controls will correspond to women who will have stayed cancer
free during the period. We will calculate the clinical markers (e.g., Gail) and the
mammographic features of the initial examination prior to the 5 to 8 year follow-up.
Multivariate analysis will be used to examine the relationship between the new model
and risk of breast cancer while controlling for other risk factors such as age at menarche
and parity. A proportional-hazards regression model will be used to calculate the relative
risk for each radiographic marker.

In  preparation  for  this  analysis,  we  investigated  the  effect  of  ROI  size  on  the 
computer-extracted  parenchymal  texture  features.  The  results  showed  that  the  ability  of 
the  texture  features  to  discriminate  between  high  risk  (BRCA1/BRCA2  mutation  carriers) 
and  low  risk  women  was  dependent  on  ROI  location  but  only  slightly  dependent  on  ROI 
size.  This  work  was  presented  at  the  2002  AAPM  meeting. 
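
To make the ROI experiment concrete, the sketch below shows one way square ROIs of different sizes could be cut from the same location of a digitized image. The function name `extract_roi`, the image dimensions, the ROI sizes, and the center coordinates are all assumptions for illustration; the report does not state which sizes or locations were compared.

```python
import numpy as np

def extract_roi(image, center_row, center_col, size=256):
    """Return a size-by-size ROI centered at (center_row, center_col).

    Raises ValueError if the ROI would extend past the image border.
    """
    half = size // 2
    top, left = center_row - half, center_col - half
    if top < 0 or left < 0 or top + size > image.shape[0] or left + size > image.shape[1]:
        raise ValueError("ROI falls outside the image")
    return image[top:top + size, left:left + size]

# Synthetic stand-in for a digitized mammogram (assumption).
rng = np.random.default_rng(0)
image = rng.integers(0, 1024, size=(1024, 1024), dtype=np.uint16)

# Same location, several hypothetical ROI sizes.
rois = {size: extract_roi(image, 512, 512, size) for size in (64, 128, 256)}
```

Varying `center_row` and `center_col` while holding `size` fixed gives the complementary location experiment.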


KEY  RESEARCH  ACCOMPLISHMENTS: 


•  Increased  our  database  of  high  and  low  risk  cases,  especially  those  with  positive 
BRCA1/BRCA2  testing. 

•  Verified  the  texture  features  for  characterizing  the  breast  parenchyma  using  three 
different  approaches  —  all  yielding  the  same  result 

•  Performed  initial  study  looking  at  the  contribution  of  age  and  mammographic  features 
to  breast  cancer  risk  prediction 

•  Evaluated  computer-extracted  texture  features  with  respect  to  ROI  location  and  ROI 
size 


REPORTABLE  OUTCOMES: 

1. Huo Z, Giger ML, Olopade OI: Analysis of the relative contributions of
mammographic  features  and  age  to  breast  cancer  risk  prediction.  Zhimin  Huo, 
Maryellen  L.  Giger  and  Olufunmilayo  I.  Olopade,  Presentation  at  International 
Workshop  on  Digital  Mammography  2000  (Toronto,  Canada) 

2.  Huo  Z,  Giger  ML:  Incorporation  of  clinical  data  into  a  computerized  method  for  the 
assessment  of  mammographic  breast  lesions.  Proceeding  Paper  Proc.  SPIE  2000. 
3979:148-152,  2000. 

3.  Huo  Z,  Giger  ML,  Olopade  OI:  Computerized  analysis  of  mammographic  patterns 
of  women  with  and  without  breast  cancer.  Zhimin  Huo,  Maryellen  L.  Giger  and 
Olufunmilayo I. Olopade, Presentation at CARS 2000 (San Francisco, CA)

4. Huo Z, Giger ML, Wolverton DE, Zhong W, Cummings S, Olopade OI:
Computerized  analysis  of  mammographic  parenchymal  patterns  for  breast  cancer 
risk assessment: Feature selection. Journal Article Medical Physics 27:4-12, 2000.

5. Huo Z, Giger ML, Zhong W, Nishikawa RM, Wolverton DE, Olopade OI:
Mammographic  parenchymal  patterns  as  predictors  for  breast  cancer  risk. 
Presentation  at  86th  Scientific  Assembly  and  Annual  Meeting  of  Radiological 
Society  of  North  America,  Chicago,  Illinois,  2000. 

6.  Huo  Z,  Giger  ML,  Zhong  W,  Olopade  OI:  Analysis  of  relative  contributions  of 
mammographic  features  and  age  to  breast  cancer  risk  prediction.  Proceeding  Paper 
Digital  Mammography  2000.  Proc.  5th  International  Workshop  on  Digital 
Mammography.  Medical  Physics  Publishing.  Wisconsin  pp.  732-736,  2001. 

7.  Huo  Z,  Giger  ML,  Olopade  OI,  Wolverton  DE,  Weber  BL,  Metz  CE,  Cummings  S, 
Zhong  W:  Computerized  analysis  of  digitized  mammograms  of  BRCA1/BRCA2 
gene  mutation  carriers.  Journal  article  Radiology  (in  press),  2002. 

8.  Li  Hui,  Giger  ML,  Huo  Z,  Olopade  O,  Lan  L,  Bonta  I:  Computerized  analysis  of 
mammographic  patterns  for  assessing  breast  cancer  risk:  Effect  of  ROI  size  and 
location. Presentation at 2002 AAPM meeting in Montreal, Canada, 2002.

9. Giger ML has an R01 grant being submitted to NCI, which formally includes the
collaboration  with  the  University  of  Toronto  and  the  Ontario  Breast  Cancer 
Screening  Program  (made  possible  by  the  results  from  the  army  idea  grant). 


CONCLUSIONS:


We  have  shown  that  computer-extracted  features  of  mammographic  parenchymal 
patterns  can  be  used  in  the  prediction  of  breast  cancer  risk.  This  has  been  demonstrated 
(on  the  developing  database)  using  three  approaches:  (1)  correlation  with  clinical  models 
of  Gail  and  Claus,  (2)  separation  between  women  at  low  risk  and  those  with  a  positive 
gene  testing  result,  and  (3)  separation  between  women  at  low  risk  and  those  that  have 
breast cancer. In addition, we have shown that the inclusion of the mammographic
features with age increases the predictive power over the use of age alone in the prediction
of breast cancer risk. We have also shown that, with our method, the performance of the
features and the classifier is quite dependent on ROI location within the breast and only
slightly dependent on ROI size.


INTRODUCTION

Studies based on visual assessment and computerized assessment of mammographic
patterns showed that increasing mammographic density is associated with
an increased risk of breast cancer on the order of 4.0-6.0 between the most extensive
and least extensive density patterns (Wolfe et al. 1987, Boyd et al. 1995, Byrne
et al. 1995, Byng et al. 1997). Quantitative computerized analysis of mammographic
patterns provides objective classification of density patterns, while
visual assessment remains limited by the subjective nature of human
observers (Warner et al. 1992). We have developed computerized methods that
characterize mammographic parenchymal patterns of women and relate these patterns
to the risk of developing breast cancer. We have studied mammographic
parenchymal patterns of cancer-free women who are at different risk of
developing breast cancer, including BRCA1/BRCA2 mutation carriers, and of women
who have developed breast cancer (Huo et al. 2000a, b). A total of 14 mammographic
features were extracted from the central breast region of digitized
mammograms to characterize percent density of the breast or the heterogeneity/
inhomogeneity (diffuse) patterns in the dense portions of the breast (Huo et al.
2000b). Three different approaches have been employed to relate these mammographic
features to the risk of developing breast cancer (Huo et al. 2000a, b). In one
approach, the features were related to risk as determined from existing clinical
models (Gail and Claus models), which use well-known epidemiological factors
such as a woman's age, her family history of breast cancer, reproductive history,
etc. (Gail and Benichou 1992, Claus et al. 1993). Stepwise linear regression analysis
was employed to identify useful features in predicting the risk as estimated
from the Gail and Claus models. Four selected features, along with age, were used
to predict the 10-year risks as estimated from the Gail and the Claus models.
Results from linear regression analysis indicated that increased mammographic
density and coarse and low contrast mammographic patterns were
positively correlated with increased breast cancer risk, yielding correlation
coefficients (r) of 0.41 and 0.57 for the Gail and the Claus models, respectively. Analysis
of mammographic patterns of women who are BRCA1/BRCA2 mutation carriers
and who have been diagnosed with breast cancer also suggested that women at
high risk of developing breast cancer tend to have dense breasts and their
mammographic patterns tend to be coarse and low in contrast (Huo et al. 2000a, b).

It is important to understand the contribution of these computer-extracted
mammographic features in predicting breast cancer risk and to study the potential
of these features in the prediction of breast cancer risk when they are
incorporated with other risk factors into a model. The purpose of this study is
to analyze the contribution of these computer-extracted mammographic features
to breast cancer risk prediction in comparison with that of age, which is the most
important single risk factor for breast cancer.

MATERIALS  AND  METHODS 

Database 

A total of 380 cancer-free cases were included in this study. Retrospective
mammograms and information regarding the reproductive history, family history of
breast cancer, and history of previous breast disease were collected for all cases to
assess an individual's short-term risk (i.e., 10-year risk) of developing breast cancer.
The 10-year risk is defined as the probability that a woman with given risk factors
and given age will develop breast cancer in the next 10 years of her life. In this study,
10-year risks of developing breast cancer were estimated for all of the cases using
both the Gail model and the Claus model. The 10-year risk was used for this study
since the Claus model calculates short-term risk only up to the 10-year intervals.
Mammograms from these cases were digitized using a Konica laser scanner (LD
4500; Konica Medical, Wayne, NJ) at 0.1 mm pixel size and 10-bit gray-level scale.

It should be noted that the cases used for the Gail and the Claus models were
different since not all of the cases have complete information required by both
the Gail and the Claus models. Of the 380 cases, 143 of them have the 10-year
risk estimated from the Gail model and 303 of them have the 10-year risk as
estimated from the Claus model.

Computer-extracted Mammographic Features

A total of 14 features were extracted from a region-of-interest (ROI) of size 256
pixels by 256 pixels, which was manually selected from the central region of
the breast image. The central breast region was used because it usually includes
the most dense parts of the breast. Detailed descriptions about these features can
be found in the literature (Huo et al. 2000b). Useful features were then identified
using the approach described above, i.e., stepwise linear regression analysis (Huo
et al. 2000b), to predict the 10-year risk as estimated from the Gail or the Claus
model.

A total of four computer-extracted mammographic features, along with age,
were selected from stepwise linear regression analysis. Age, skewness, coarseness,
and contrast were selected for the 10-year risk as determined from the Gail model.
Age, skewness, RMS variation, and coarseness were selected for the 10-year risk as
determined from the Claus model. The skewness from gray-level histogram analyses
and the root-mean-square (RMS) variation from the Fourier transform were
calculated to indicate the percent density. The coarseness and contrast features
were obtained based on the spatial relationship among gray levels. They were used
to characterize the heterogeneity of the dense tissue patterns within the ROIs.
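
As a rough illustration of the two density-related features named above, the sketch below computes a gray-level skewness and a Fourier-based RMS variation on a synthetic ROI. These are generic textbook definitions chosen for illustration; the exact definitions used in the study (for example any filtering or band-limiting of the power spectrum) are given in Huo et al. 2000b and may differ. Function names and the synthetic ROI are assumptions.

```python
import numpy as np

def skewness(roi):
    """Third standardized moment of the ROI gray-level distribution."""
    x = roi.astype(float).ravel()
    centered = x - x.mean()
    return np.mean(centered ** 3) / np.std(x) ** 3

def rms_variation(roi):
    """RMS variation from the 2-D power spectrum, DC term excluded."""
    x = roi.astype(float)
    power = np.abs(np.fft.fft2(x)) ** 2
    power[0, 0] = 0.0                  # drop the mean (DC) component
    return np.sqrt(power.sum()) / x.size

# Synthetic 256 x 256 ROI with 10-bit gray levels (assumption).
rng = np.random.default_rng(0)
roi = rng.integers(0, 1024, size=(256, 256))
print(skewness(roi), rms_variation(roi))
```

With this unfiltered definition, Parseval's theorem makes the RMS variation equal to the standard deviation of the gray levels, which gives a convenient sanity check on the implementation.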

Regression  Analysis  on  Age  and  Mammographic  Features 

To assess the contribution of age alone in the prediction of an individual's 10-year
risk, linear regression on age alone was performed. It should be noted that age
is one of the risk factors used in both the Gail and the Claus models. The
relationship between the 10-year risk and age was represented by a linear regression
model. The proportion of the total variation in the 10-year risk explainable by
age employing the linear regression model was used to quantify the "contribution"
of age alone in the prediction of 10-year risk, as indicated by the squared
correlation coefficient, r² (Hays 1994).

To assess the contribution of the selected mammographic features in the
prediction of 10-year risk, linear regression on age and the mammographic features
was performed to predict the 10-year risks as estimated from the Gail and the
Claus models. The relationship of the 10-year risk with age and the selected
mammographic features was represented by a multiple linear regression model.
The "contribution" from age and the mammographic features together in the
prediction of 10-year risk was indicated by the squared multiple correlation
coefficient, r² (Hays 1994).

It should be noted that r² ranges from 0 to 1, where r² = 1 indicates 100%
of total variation in the observed values (e.g., 10-year risk as estimated from the
Gail model) explained by the regression model or by the independent variables
(e.g., the features). In other words, with r² = 1, all of the observed values for an
individual's 10-year risk fall exactly on the straight line represented by the
regression model.

Relative  Contribution  of  Mammographic  Features 
in  Comparison  with  Age 

Addition of any features to the regression model increases the squared multiple
correlation coefficient, r². The increase in r² measures the additional worth of
the added features but depends on the features already in the model. The increase
in r², when the mammographic features are added to the regression model,
quantifies the percentage of the total variation in the 10-year risk explained by the
mammographic features but not by age. As mentioned above, the contribution
of age alone can be quantified in terms of r² when age is used alone. The additional
contribution of mammographic features can be quantified in terms of the
increase, Δr², in r² when these features are added. The relative contribution of
these mammographic features in the prediction of an individual's 10-year risk
is measured by the percent increase in r², i.e., Δr²/r².
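
The quantities defined in this section can be sketched numerically as follows. The data are synthetic and the coefficients arbitrary (both are assumptions for illustration only); the point is the mechanics of computing r² for the age-only model, r² for the model with added features, the increase Δr², and the ratio of Δr² to the age-only r².

```python
import numpy as np

def r_squared(X, y):
    """Squared (multiple) correlation coefficient of a least-squares fit."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

# Synthetic stand-ins for age, three image features, and 10-year risk.
rng = np.random.default_rng(1)
n = 300
age = rng.uniform(30, 70, size=n)
features = rng.normal(size=(n, 3))
risk = (0.001 * age + 0.01 * features[:, 0] - 0.008 * features[:, 1]
        + rng.normal(scale=0.02, size=n))

r2_age = r_squared(age, risk)                                # age alone
r2_full = r_squared(np.column_stack([age, features]), risk)  # age + features
delta_r2 = r2_full - r2_age                                  # additional contribution
relative = delta_r2 / r2_age                                 # relative contribution
print(r2_age, r2_full, delta_r2, relative)
```

Because the age-only model is nested in the larger model, `delta_r2` cannot be negative for a least-squares fit, which is why an increase in r² alone is not evidence that an added feature is useful.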

RESULTS 

The models based on regression on age alone and on mammographic features with
age are listed in table 1 for the 10-year risk as estimated from the Gail and
Claus models. It should be noted that the analysis was performed separately
for the risk as estimated from the Gail and the Claus models using the two
different subsets of the database.

Table 1. Linear Regression Models on Age Alone and Along
with Mammographic Features

10-year risk% (Gail model)   =  -0.025 + 0.001 age
10-year risk% (Gail model)   =  -0.03 - 0.004 skew + 34.51 cos - 38.31 con + 0.002 age
10-year risk% (Claus model)  =  -0.076 + 0.003 age
10-year risk% (Claus model)  =  -0.09 - 0.013 skew + 0.002 rms - 100.52 con + 0.004 age

NOTE: Skew, cos, con, and rms correspond to the skewness, coarseness,
contrast, and RMS variation.

As shown in table 1, the 10-year risk as estimated
from the Gail model (303 cases) was positively correlated with age and
coarseness, and was negatively correlated with skewness and contrast, yielding a
correlation coefficient of 0.28 (p-value < 0.001) when age alone was used and a
correlation coefficient of 0.41 (p-value < 0.001) when the mammographic
features were included. The 10-year risk as estimated from the Claus model (143
cases) was positively correlated with age and RMS variation and was negatively
correlated with skewness and contrast, yielding a correlation coefficient of 0.4
(p-value < 0.001) when age was used alone and a correlation coefficient of 0.57
(p-value < 0.001) when the mammographic features were added. The results
imply that an individual's 10-year risk increases with age, with increasing
mammographic density, and with coarse, low-contrast mammographic texture patterns.
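
For readers who want to see the age-only rows of table 1 in use, the sketch below evaluates them at a few ages. The coefficients are copied from the table; the function names and the chosen ages are illustrative, and the scale of the output (fraction versus percent) is not stated in the source, so the values should be read only as the raw output of the fitted equations.

```python
def gail_age_only(age):
    """Age-only regression for the Gail 10-year risk (table 1, first row)."""
    return -0.025 + 0.001 * age

def claus_age_only(age):
    """Age-only regression for the Claus 10-year risk (table 1, third row)."""
    return -0.076 + 0.003 * age

for age in (40, 50, 60):  # hypothetical ages
    print(age, round(gail_age_only(age), 3), round(claus_age_only(age), 3))
```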

In terms of contribution measured by r², regression on age alone yielded r² values
of 0.08 and 0.16 for the 10-year risks as estimated from the Gail and the Claus
models, respectively. Regression on age and the selected mammographic
features yielded r² values of 0.17 and 0.32 for the 10-year risk as estimated from the Gail
and the Claus models, respectively, which corresponds to increases of 113% and
100% in r².
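
As a consistency check, the r² values in this paragraph can be recovered by squaring the correlation coefficients reported in the preceding paragraph. The snippet uses only numbers stated in the text; the dictionary layout is illustrative.

```python
# (reported r, reported r^2) for each model and predictor set
reported = {
    "Gail, age alone": (0.28, 0.08),
    "Gail, age + features": (0.41, 0.17),
    "Claus, age alone": (0.40, 0.16),
    "Claus, age + features": (0.57, 0.32),
}

squared = {name: round(r ** 2, 2) for name, (r, _) in reported.items()}
for name, (r, r2) in reported.items():
    print(f"{name}: r = {r}, r^2 = {squared[name]} (reported {r2})")
```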


DISCUSSION 

Age has been identified as the most important risk predictor for breast cancer
in women. Dense mammographic parenchymal patterns have been identified as
one of the important risk factors for breast cancer. In this paper, we studied the
association of the 10-year risks as estimated from the Gail and the Claus models
with age and mammographic patterns as characterized by computer-extracted
features using linear regression analysis. The contribution of age and the
mammographic features to breast cancer risk prediction was quantified in terms of
the squared correlation coefficient, r², i.e., the percentage of the total variation
in the risk explainable by age alone or together with the mammographic features.
The relative increases of 113% and 100% in r² for the 10-year risks as estimated
from the Gail and the Claus models, respectively, indicate that the mammographic
features, which were included in the regression model, contributed as
much as age in the prediction of breast cancer risk as estimated from the Gail
and the Claus models, although the results need to be validated using a larger
number of cases. Such a substantial contribution to the prediction of breast
cancer risk, in comparison with that of age, indicates the importance of
mammographic features in breast cancer risk prediction, and the need to incorporate
them into a breast cancer risk prediction model.